An Integrated Approach: A Hybrid Machine Learning Model for the Classification of Unscheduled Stoppages in a Mining Crushing Line Employing Principal Component Analysis and Artificial Neural Networks

This article implements a hybrid Machine Learning (ML) model to classify stoppage events in a copper-crushing equipment, more specifically, a conveyor belt. The model combines Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs) with Principal Component Analysis (PCA) to identify the type of stoppage event when they occur in an industrial sector that is significant for the Chilean economy. This research addresses the critical need to optimise maintenance management in the mining industry, highlighting the technological relevance and motivation for using advanced ML techniques. This study focusses on combining and implementing three ML models trained with historical data composed of information from various sensors, real and virtual, as well from maintenance reports that report operational conditions and equipment failure characteristics. The main objective of this study is to improve the efficiency when identifying the nature of a stoppage serving as a basis for the subsequent development of a reliable failure prediction system. The results indicate that this approach significantly increases information reliability, addressing the persistent challenges in data management within the maintenance area. With a classification accuracy of 96.2% and a recall of 96.3%, the model validates and automates the classification of stoppage events, significantly reducing dependency on interdepartmental interactions. This advancement eliminates the need for reliance on external databases, which have previously been prone to errors, missing critical data, or containing outdated information. By implementing this methodology, a robust and reliable foundation is established for developing a failure prediction model, fostering both efficiency and reliability in the maintenance process. The application of ML in this context produces demonstrably positive outcomes in the classification of stoppage events, underscoring its significant impact on industry operations.


Introduction
In recent years, the evolution of machinery and industrial assets has brought about changes in maintenance management.This process has undergone successive transformations over time [1], driven by more complex designs, new maintenance methods, and a changing perspective on maintenance organisation.Due to this evolution, the industrial sector focusses on finding increasingly precise ways to generate maintenance plans that align with the requirements of different industries.One of the most commonly used maintenance plans is preventive maintenance, which involves prescheduled activities to minimise unexpected damage, reduce production downtime due to failures, and consequently lower associated costs [2].

Literature Review
To address the research problem and its development, it is necessary to understand the concept of preventive maintenance policies, along with the ML techniques and complimentary performance metrics employed throughout this research.

Preventive Maintenance
Maintenance is the combination of all technical, administrative and managerial actions during the life cycle of an item intended to retain it in, or restore it to, a state in which it can perform the required function [15].In this context, different maintenance policies, such as preventive and condition-based policies [16], define the guidelines based on which a company is to plan all its maintenance actions.This article focusses on preventive maintenance [17], which is the set of maintenance activities performed at predetermined time intervals or according to prescribed criteria to reduce the probability of failure or degradation of the functionality of an item.In addition, predictive maintenance aims to predict the optimal time point for maintenance actions, considering information about the system's health state and historical maintenance data [18].
In recent years, condition monitoring systems for industrial equipment have become conventional due to lower costs and the increased reliability of sensors, data transmission, and storage devices.Simultaneously, the IoT has enabled real-time transmission of this information on system conditions captured by different monitoring devices enabling fault detection [19].This development offers an excellent opportunity to use condition monitoring data intelligently within predictive maintenance [20].On the other hand, ML can be interpreted as the science of using computer algorithms to perform a specific task by the use of data without applying explicit instructions [21].

Artificial Neural Networks (ANNs)
Artificial Neural Networks are mathematical models inspired by the biological behaviour of neurons and how they are organised to form the brain's structure [22].They can be explained as a simplified model of the brain; this method is used to solve classification (pattern recognition, feature extraction, and image matching) and prediction problems.
The typical model is composed of layers with different perceptrons as represented in Figure 1.A perceptron, equivalent to an artificial neuron, is the fundamental unit of the network which handles computations to detect features.These layers include the input layer, which receives the input data or input features; the hidden layers, which processes the input of each perceptron from the preceding layer through a weighted sum and an appropriate activation function, which mathematically defines whether or not to send a signal to the next layer; and the output layer, which produces the network's response [23].
The following equation mathematically represents the output of the model: where x is the l-dimensional input vector w ij is the weight factor connecting the input neuron i to the hidden neuron j, w j is the weight factor connecting the hidden neuron j to the output neuron, β is the bias for the output neuron, ε is a random error, and δ is the activation function.The following equation mathematically represents the output of the model: where  is the l-dimensional input vector  is the weight factor connecting the input neuron  to the hidden neuron ,  is the weight factor connecting the hidden neuron  to the output neuron, β is the bias for the output neuron, ε is a random error, and δ is the activation function.

Support Vector Machine (SVM)
This statistical method was introduced by Vladimir Vapnik around 1960 and was initially focussed on binary classification, where hyperplanes separate data.SVM has been extended to handle multi-class classification, regression, and outlier detection.This algorithm is one of the most effective and widely used ML techniques [24].Figure 2 shows a set of observations that fall into two classes, red and blue, SVM is able to find a hyperplane that sets a boundary between them.

Support Vector Machine (SVM)
This statistical method was introduced by Vladimir Vapnik around 1960 and was initially focussed on binary classification, where hyperplanes separate data.SVM has been extended to handle multi-class classification, regression, and outlier detection.This algorithm is one of the most effective and widely used ML techniques [24].Figure 2 shows a set of observations that fall into two classes, red and blue, SVM is able to find a hyperplane that sets a boundary between them.This algorithm generates a separation hyperplane that maximises the margin between two datasets according to their classes.Two parallel hyperplanes are established on either side of the separation hyperplane to create the margin.The hyperplane can be described as follows: The optimal solution is obtained by reaching the highest possible margin.To find the optimal separation hyperplane, the SVM algorithm must maximise this margin while adhering to constraints by solving the following quadratic optimisation problem [25].This algorithm generates a separation hyperplane that maximises the margin between two datasets according to their classes.Two parallel hyperplanes are established on either side of the separation hyperplane to create the margin.The hyperplane can be described as follows: Sensors 2024, 24, 5804 The optimal solution is obtained by reaching the highest possible margin.To find the optimal separation hyperplane, the SVM algorithm must maximise this margin while adhering to constraints by solving the following quadratic optimisation problem [25].
Equation ( 3) is for data that can be separated linearly; this article considers data that are separated nonlinearly, making it necessary to perform a transformation on the data, to find a hyperplane that classifies the categories.The mathematical function used for the transformation is the kernel function k(x, y) [26].The kernel function transforms the original input data to be converted into a higher-dimensional feature space.These kernel functions are essential to the operation of SVMs as they are able to address nonlinear classification problems, where a hyperplane in the original space cannot effectively separate the data.In general, a kernel function projects the data from a low-dimensional space to a higher-dimensional space [27].There are different types of kernels, and in this study, three will be applied to select the one that provides the best results.The selection of the appropriate kernel function is crucial because the kernel defines the feature space in which the training set examples will be classified [28].The selected kernel functions are as follows: Of the various kernels available in SVM, using linear, polynomial, and radial kernels among the multiple options available in SVM is based on their broad applicability and versatility in most classification problems.Limiting to these three kernels simplifies the model selection and tuning process without sacrificing the ability to address linear and nonlinear classification problems.

Principal Component Analysis (PCA)
A fundamental tool in modern data analysis, employed across multiple disciplines, is Principal Component Analysis (PCA).Introduced by Karl Pearson in 1901, it is a linear projection that minimises the average projection cost, defined as the mean squared distance between the data points and their projections [29].
As a dimensionality reduction technique, PCA transforms a set of possibly correlated variables into a smaller set of new orthogonal variables called principal components.The values of these new variables for the observations are referred to as factor scores and can be interpreted as projections of the observations onto the principal components [30].This technique is particularly useful for visualising and exploring high-dimensional datasets, as it can easily identify trends, patterns, or outliers [31].PCA is effective for data preprocessing in ML algorithms, as it extracts the most informative features from large datasets while preserving the most relevant information from the initial dataset [32].By projecting a highdimensional dataset into a smaller feature space, PCA minimises or eliminates common issues such as multicollinearity and overfitting.
The mathematical formulation [30] begins with the following: Let X = [x 1 , . . . ,x n ] be the dataset to be analysed using Principal Component Analysis (PCA), where each column represents a single observation described by M variables.The sample mean vector x and the sample covariance matrix ∑ can be represented as follows: where . PCA computes factor scores as linear combinations of the original variables.
The optimal weight a 1 is obtained by maximising the variance in y 1i under the constraint a T 1 a 1 = 1.This leads one to find the eigenvector a1a_1a1 corresponding to the largest eigenvalue λ 1 o f Σ : Additional principal components are similarly obtained as eigenvectors corresponding to subsequent largest eigenvalues.For an L-dimensional projection, the optimal projection matrix A consists of the L eigenvectors of ∑ associated with the L largest eigenvalues λ 1 , . . . ., λ L :

Performance Metrics
To evaluate the model's effectiveness in classifying stoppages, metrics for comparing the different techniques and functions used are defined.One of the selected tools for evaluation includes the confusion matrix [33], which shows the number of correct and incorrect classifications made by the model on a dataset.See Table 1.Other metrics are derived from the confusion matrix as follows [33].Accuracy calculates the proportion of instances classified correctly (both true positives and true negatives) out of the total classifications [34].Accuracy is a vital ML metric that quantifies the model's overall correctness in making predictions.One of its significant advantages is its simplicity and ease of interpretation.It provides a clear, straightforward percentage of how many predictions were correct, making it a valuable metric for quickly assessing the general performance of a model.
Precision (P) will also be used, representing the number of true positives among the total number of positive predictions, which is crucial for minimising false positives.High precision indicates a low rate of false positives, which is important in applications where false positives are costly or undesirable.
Sensors 2024, 24, 5804 7 of 21 Recall (r) is a metric used to measure the fraction of true positive values compared to the total number of positive cases.High recall implies a low rate of false negatives, making it important when missing a positive instance has significant consequences.
Specificity, also known as the true-negative rate, is a fundamental metric in classification models, reflecting the model's ability to accurately identify negative cases.It is calculated as the proportion of true negatives (TN) to the sum of the true negatives and false positives (FP).Mathematically, the formula for specificity is The F-measure (FM) can be derived from the recall and precision metrics, representing the harmonic mean between precision and recall, and provides a balanced assessment of a model's performance, especially when precision and recall need to be balanced.It is beneficial when dealing with imbalanced datasets, as it combines false positives and negatives into a single score, helping in model evaluation and comparison.
Performance metrics play a key role in various ML applications.These metrics are used to effectively evaluate and compare models against each other, providing a solid basis for making decisions about which model and algorithm perform best in the area of interest.Their widespread use is due to their proven effectiveness in model evaluation [35][36][37][38], which contributes significantly to informed decision making in ML.
The choice of evaluation metrics in an ML model is paramount as it determines how performance is measured.Each metric, such as accuracy, precision, recall, and F-measure, provides a unique perspective on the model's quality and allows for tailored evaluation based on the specific problem; thus, by combining these metrics, a balanced approach is achieved, considering the model's accuracy and the minimisation of false positives or negatives, resulting in a more robust and effective model.

Problem Statement or Problem Formulation
The equipment to be investigated is a conveyor belt (CV-2C), which is part of a mining crushing line (Figure 3) that consists of a crusher where large pieces of material are reduced to smaller pieces which are later transported by conveyor belt "CV-1C" into feeder "FE002" which deposits the material into conveyor belt "CV-2C" that moves the load towards the rest of the process.
The conveyor belt CV-2C is a critical component in the crushing and material conveying line.Its correct operation is essential for maintaining the efficiency and productivity of the process.However, the conveyor belt has experienced several stoppages due to failures or operational decisions which are not informed in a timely manner.This has led to several delays when attempting to resume operation due to the unknown nature of the stoppage, which gives rise to the need for research on a classification model that can be later integrated into failure prediction systems, providing valuable input to anticipate problems, improve maintenance management, and reduce unplanned downtime.
The main identified failure modes of the CV-2C conveyor belt include the following: • Electrical failure: Problem related to the belt's electrical system; • Damaged idler: Damage to the idlers supporting the belt; • Alignment rod failure: Problems with the belt alignment rod.
To address the conveyor belt problem, several critical variables that affect its operation are identified and monitored.Monitored data are collected through a combination of physical and virtual sensors: ing crushing line (Error!Reference source not found.)that consists of a crusher where large pieces of material are reduced to smaller pieces which are later transported by conveyor belt "CV-1C" into feeder "FE002" which deposits the material into conveyor belt "CV-2C" that moves the load towards the rest of the process.
The conveyor belt CV-2C is a critical component in the crushing and material conveying line.Its correct operation is essential for maintaining the efficiency and productivity of the process.However, the conveyor belt has experienced several stoppages due to failures or operational decisions which are not informed in a timely manner.This has led to several delays when attempting to resume operation due to the unknown nature of the stoppage, which gives rise to the need for research on a classification model that can be later integrated into failure prediction systems, providing valuable input to anticipate problems, improve maintenance management, and reduce unplanned downtime.To address the conveyor belt problem, several critical variables that affect its operation are identified and monitored.Monitored data are collected through a combination of physical and virtual sensors: Physical sensors: Embedded into the system and monitored by a Distributed Control System (DCS), they directly measure operational variables such as motor load, speed, and power.These sensors are certified and regularly maintained by a third-party company.
Virtual sensors: Generated by Simulation Digital Twin software, these sensors provide belt stress information and operational status which are estimated from the DCS data through internal algorithms.
Key parameters obtained from the physical and virtual sensors include the following: Physical sensors: Embedded into the system and monitored by a Distributed Control System (DCS), they directly measure operational variables such as motor load, speed, and power.These sensors are certified and regularly maintained by a third-party company.
Virtual sensors: Generated by Simulation Digital Twin software, these sensors provide belt stress information and operational status which are estimated from the DCS data through internal algorithms.
Key parameters obtained from the physical and virtual sensors include the following:

Methodology
This section will detail the step-by-step process, including data collection and analysis, along with the key aspects required to achieve the desired results.Figure 4 depicts the overall process, where the elements in blue represent the processes and stages carried out in this article.The process starts with data collection from the DCS and Simulation Digital Twin software, followed by a rigorous data preprocessing and exploration development and optimisation.Once the data are prepared, two hybrid models are trained and compared to select the best classification model.
The transition from the classification models, shown in blue, to the "Failure Prediction Model", highlighted in green, underlines the continuity of this research from the current approach towards a future application in failure prediction.
Sensors 2024, 24, 5804 9 of 21 processes and stages carried out in this article.The process starts with data collection from the DCS and Simulation Digital Twin software, followed by a rigorous data preprocessing and exploration development and optimisation.Once the data are prepared, two hybrid models are trained and compared to select the best classification model.
The transition from the classification models, shown in blue, to the "Failure Prediction Model", highlighted in green, underlines the continuity of this research from the current approach towards a future application in failure prediction.

Data Collection
The article's general methodology commences with data collection.The database comprises data related to the history of equipment stoppages, as well as information concerning voltage variables and the operational status of the component.This information

Data Collection
The article's general methodology commences with data collection.The database comprises data related to the history of equipment stoppages, as well as information concerning voltage variables and the operational status of the component.This information is obtained from three sources, as represented in Figure 5: the Distributed Control System (DCS), a control infrastructure employed in industrial settings to oversee and manage various operational parameters [39]; the Digital Twin (DT) system, a virtual emulation of real-world systems or processes, reproducing variables and conditions [40]; and the maintenance database, which houses historical stoppage information.
Sensors 2024, 24, x FOR PEER REVIEW 10 of 22 is obtained from three sources, as represented in Error!Reference source not found.: the Distributed Control System (DCS), a control infrastructure employed in industrial settings to oversee and manage various operational parameters [39]; the Digital Twin (DT) system, a virtual emulation of real-world systems or processes, reproducing variables and conditions [40]; and the maintenance database, which houses historical stoppage information.The DCS informs the measures of the parameters shown in Table 2 while the DT informs accurate estimations of the parameters represented in Table 3, and the maintenance database reports the maintenance stoppages information described in Table 4.To ensure the quality of data collected from the maintenance and operations databases, quality control measures were implemented.These included data integrity verification procedures, such as duplicate checking and cross-consistency reviews with technical experts to ensure data accuracy and relevance.These practices sustain the credibility of the analysis and its findings.
By cross-referencing the databases, an extra Boolean label was included on the unified database that classifies all the stoppage events recorded during the period analysed in two main categories: operational stoppage and failure.
As shown in Table 5, eleven different classes of stoppages have been identified, of which three correspond to operational stoppages and are identified as equipment failure.

Data Preprocessing
After completing data collection, the next step in the methodology is data preprocessing.This article mainly divides this process into two stages: data cleansing and transformation.To better understand the key actions to be carried out in these stages, they can be visualised in detail in Figure 6.The database contains 511,265 historical data points, from which only 41,447 records correspond to stoppages of the CV-2C conveyor belt and will be considered for this analysis.In the first part of the preprocessing, data cleaning was performed on the database; this cleaning considered the removal of null or empty data, erroneous data, and duplicates from the database.As a result of this cleaning, the number of data points was reduced from 41,447 to 40,596 valid records.
With the selected data, the second part of the preprocessing is performed, where the search for outliers, which are defined as points lying three standard deviations away from the mean, was conducted.Each detected outlier is carefully examined in the context of domain-specific knowledge.This makes it possible to distinguish between measurement errors and legitimate operational variations and decide on the appropriate action for each case, either eliminating or correcting these values, thus minimising the risk of bias in subsequent analyses.
During the transformation phase, the data are normalised to improve the accuracy of the models and avoid scaling problems between the numerical entries of the database.In addition, Boolean entries were numerically coded for the use of the ML algorithms.

Data Exploration
An exploratory data analysis was performed on the database to understand the characteristics and possible patterns in the data.Key variables, such as component operating The database contains 511,265 historical data points, from which only 41,447 records correspond to stoppages of the CV-2C conveyor belt and will be considered for this analysis.In the first part of the preprocessing, data cleaning was performed on the database; this cleaning considered the removal of null or empty data, erroneous data, and duplicates from the database.As a result of this cleaning, the number of data points was reduced from 41,447 to 40,596 valid records.
With the selected data, the second part of the preprocessing is performed, where the search for outliers, which are defined as points lying three standard deviations away from the mean, was conducted.Each detected outlier is carefully examined in the context of domain-specific knowledge.This makes it possible to distinguish between measurement errors and legitimate operational variations and decide on the appropriate action for each case, either eliminating or correcting these values, thus minimising the risk of bias in subsequent analyses.
During the transformation phase, the data are normalised to improve the accuracy of the models and avoid scaling problems between the numerical entries of the database.In addition, Boolean entries were numerically coded for the use of the ML algorithms.

Data Exploration
An exploratory data analysis was performed on the database to understand the characteristics and possible patterns in the data.Key variables, such as component operating conditions and stress indicators, were examined to identify trends, correlations, and any relevant information that could contribute to the accurate classification of events.
Measures of central tendency, such as mean, variance, and median, were calculated to examine the behaviour of each of the variables.These measures provide important information about the data's characteristics.However, these statistics alone may not reveal complex relationships and patterns within the dataset, especially when dealing with multiple variables.Table 6 summarises the key statistical characteristics of the most relevant entries from the database.From these initial analyses, no clear conclusion could be drawn, which could be explained by relationships between the variables that are not readily apparent through traditional summary statistics and visualisations.
In addition, a correlation analysis revealed that most variables share some significant correlation, which is a problem when employing traditional methods such as linear regressions due to the negative effect that multicollinearity has on the variance.
However, correlation analysis assistance has limitations, as it focusses on pairwise relationships and may not capture higher-order dependencies or collinearities between multiple variables.

Training Model
Due to the high correlation between variables and the need to capture dependencies, PCA is employed, enabling dimensionality reduction in the dataset while preserving the most critical information.It computes and proposes a set of uncorrelated linear combinations of the original observations of the dataset to describe most of the variability of the data.
The next step is to split data for model training.This is performed randomly, splitting, 80% for training and 20% for testing.The division of the data into training and test sets was performed randomly to preserve the independence and representativeness of the samples.
An ANN was implemented composed of an input layer, two hidden layers and an output layer.The first hidden layer contains 64 perceptrons with a ReLU (Rectified Linear Unit) activation function, while the second hidden layer has 32 perceptrons also with ReLU activation.The output layer uses a sigmoid activation function for binary classification of the maintenance events.The model was trained for 1000 epochs.
For comparison with the ANN, an SVM algorithm was implemented.In this regard, 3 different kernels were tested with 4 different C-values each.All the steps of the methodology were implemented using Python 3.10.9.

Results and Discussion
This section presents results from the experimental data to evaluate the performance of the stoppage classification models.Individual results will be presented for each of the two algorithms, and a comparison will be provided.
Classifying between shutdowns caused by unexpected equipment failures and scheduled shutdowns is crucial for several reasons and enables organisations to optimise maintenance and operation strategies.Unplanned outages are often costly and disruptive, and the ability to predict them can prevent economic losses and ensure uninterrupted production.
The configuration of the equipment used to carry out this study is essential for understanding the context in which this research was developed and could be improved with greater capabilities.The setup for the computations of this investigation was a desktop computer running Python 3.10 with Windows 10 64-bits on an Intel Core i5-7200 processor, accompanied by a RAM of 16 GB.

Principal Component Analysis
Before presenting the results obtained with SVM and ANN, it is important to highlight that in this study Principal Component Analysis (PCA) was implemented as a dimensionality reduction technique in conjunction with these algorithms, the combination of PCA with ML can improve the accuracy and efficiency of the models [41], in addition to avoiding multicollinearity problems [42].PCA was applied to mitigate data complexity by removing redundant or irrelevant features, thus allowing a more efficient representation of critical features for failure event classification.
PCA was performed with four principal components explaining over 90% of the variance (see Figure 7).In addition to improving model accuracy, PCA helped significantly reduce processing times, particularly in industrial applications where operational efficiency and real-time decision making are imperative; when applying PCA, the results achieved using SVM and ANN are presented in Sections 5.2 and 5.3.
Sensors 2024, 24, x FOR PEER REVIEW 14 of 22 Before presenting the results obtained with SVM and ANN, it is important to highlight that in this study Principal Component Analysis (PCA) was implemented as a dimensionality reduction technique in conjunction with these algorithms, the combination of PCA with ML can improve the accuracy and efficiency of the models [41], in addition to avoiding multicollinearity problems [42].PCA was applied to mitigate data complexity by removing redundant or irrelevant features, thus allowing a more efficient representation of critical features for failure event classification.
PCA was performed with four principal components explaining over 90% of the variance (see Figure 7).In addition to improving model accuracy, PCA helped significantly reduce processing times, particularly in industrial applications where operational efficiency and real-time decision making are imperative; when applying PCA, the results achieved using SVM and ANN are presented in Sections 5.2 and 5.3.The graph in Figure 8 clearly shows the percentage of variance explained by each of the four principal components selected.The first four coefficients with their corresponding wights of the four selected principal components are presented in Equations ( 18)-( 21 To better understand the complexity of the operational data and how certain factors contribute to the overall dynamics of the system, a principal component composition analysis was conducted to elucidate the impact of operational factors on system dynamics.This analysis revealed that PCA tends to prioritise variables related to the level of use of the belt, such as the employed power, current, and produced torque, which is expected due to the nature of the process and the system under study.

Support Vector Machine
For this algorithm, an analysis of the values obtained for different kernels and various values of C was performed, where C controls the penalty for classification errors.The results obtained for different combinations are presented in Table 7.
In Table 7, the selected kernel for developing the model is RBF with a C of 100, which yields an accuracy of 94.94%.The confusion matrix of this setup is shown in Figure 9.  Table 8 confirms that even though SVM is capable of high accuracy, it does not excel in specificity, which may lead to a costly number of false positives.

Artificial Neural Network
After training under the same setup, the performance of the Artificial Neural Network (ANN) in classifying stoppage events is analysed.Figure 10 shows the confusion matrix regarding the performance achieved by the ANN in this task.The confusion matrix breaks down the classification performed by the model, where 5918 events were correctly identified as operational stoppages and 1893 as failures, demonstrating the algorithm's effectiveness in distinguishing between these critical categories.However, there were some significant misclassifications: 226 operational stoppages were incorrectly labelled as failures, representing Type I errors.This error suggests an over-sensitivity of the model towards detecting failures, which can lead to unnecessary and costly maintenance interventions.Conversely, 83 actual failures were incorrectly classified as operational stoppages, constituting Type II errors, posing risks to safety and reliability by not performing critical maintenance as needed.The confusion matrix in Figure 9 reveals critical aspects of the model's performance in classifying stoppage events.Among the analysed events, 5917 were correctly identified as operational stoppages and 1792 as failures, demonstrating the model's ability to discern between the two types of stoppages.However, significant deviations were observed: 254 operational stoppages were erroneously classified as failures, constituting type I errors or false positives.This type of error can lead to unnecessary maintenance actions, increasing operational costs.Additionally, 157 actual failures were not detected and were incorrectly classified as operational stoppages, representing type II errors or false negatives, which could result in a lack of necessary maintenance and potentially critical equipment failures.The precise distinction between these types of errors is fundamental for the continuous optimisation of the maintenance process and the improvement in the model's reliability.The other metrics that will be used to evaluate and compare the model are presented in Table 8.Table 8 confirms that even though SVM is capable of high accuracy, it does not excel in specificity, which may lead to a costly number of false positives.

Artificial Neural Network
After training under the same setup, the performance of the Artificial Neural Network (ANN) in classifying stoppage events is analysed.Figure 10 shows the confusion matrix regarding the performance achieved by the ANN in this task.The confusion matrix breaks down the classification performed by the model, where 5918 events were correctly identified as operational stoppages and 1893 as failures, demonstrating the algorithm's effectiveness in distinguishing between these critical categories.However, there were some significant misclassifications: 226 operational stoppages were incorrectly labelled as failures, representing Type I errors.This error suggests an over-sensitivity of the model towards detecting failures, which can lead to unnecessary and costly maintenance interventions.Conversely, 83 actual failures were incorrectly classified as operational stoppages, constituting Type II errors, posing risks to safety and reliability by not performing critical maintenance as needed.These results highlight the importance of refining the model's accuracy to minimise both types of errors, thereby ensuring the effectiveness and efficiency of the maintenance process.The results obtained for the performance metrics regarding the ANN are presented in Error!Reference source not found..In general, performance is improved with respect to SVM.The ANN was able to improve the scores obtained in precision and specificity, hinting that false positives were less common when using an ANN.

Processing Time
Another aspect of relevance in the analysis of the results is the processing time required by each of the classification algorithms when PCS is conducted.This factor, fundamental in evaluating the proposed solutions' efficiency, is presented in detail in the table showing the execution time before and after applying the PCA for the ANN and SVM algorithms.Reducing processing time, one of the key advantages of PCA, becomes a prominent element in evaluating the feasibility of implementing these models in an industrial environment.These results highlight the importance of refining the model's accuracy to minimise both types of errors, thereby ensuring the effectiveness and efficiency of the maintenance process.The results obtained for the performance metrics regarding the ANN are presented in Table 9.In general, performance is improved with respect to SVM.The ANN was able to improve the scores obtained in precision and specificity, hinting that false positives were less common when using an ANN.

Processing Time
Another aspect of relevance in the analysis of the results is the processing time required by each of the classification algorithms when PCS is conducted.This factor, fundamental in evaluating the proposed solutions' efficiency, is presented in detail in the table showing the execution time before and after applying the PCA for the ANN and SVM algorithms.Reducing processing time, one of the key advantages of PCA, becomes a prominent element in evaluating the feasibility of implementing these models in an industrial environment.
The processing time results are positive, as a drastic decrease in the time required by both algorithms is observed after applying the PCA.Compared to previous runtimes, a substantial reduction of more than 40% in the time required to complete the classification tasks has been achieved.The optimisation of the processing time, due to PCA, is presented in Table 10.

Comparison
Confusion matrices for the ANN and SVM models provide essential insights into the effectiveness of these models in classifying maintenance events and the types of common errors that arise during classification.Classification errors manifest as false positives (FP) and false negatives (FN).An FP occurs when the model incorrectly predicts a failure in an operational shutdown instance; conversely, an FN occurs when an actual failure is misclassified as an operational shutdown.These errors are critical as they can lead to unnecessary maintenance or operational shutdown instances needing maintenance, thereby increasing costs and unplanned downtime.
Specifically, the SVM model recorded 254 false positives (Type I errors) and 157 false negatives (Type II errors), while the ANN model showed 226 false positives and 83 false negatives.These differences suggest that, although both models are effective, the ANN has a slightly better rate of reducing Type II errors, which is crucial for avoiding skipped maintenance.This reduction in false negatives is essential for ensuring that all necessary maintenance tasks are performed, and no critical failures are overlooked, thus maintaining operational integrity and safety.
Regarding the performance metrics, Table 11 presents the results from both classification algorithms applied.It is observed that, overall, the ANN performed better than the SVM.In accuracy, the ANN shows more favourable results where 96.2% of the analysed stoppages are correctly categorised.For SVM, the specificity is 91.9%, and for ANN, it is four points higher at 95.8% indicating that ANN is more effective in accurately rejecting non-failure events, minimising the unnecessary interventions and maintenance that could result from incorrectly flagged operational stoppages.This capability is crucial in industrial settings where false alarms can lead to costly and disruptive maintenance actions that are not needed.The precision of the ANN scored 98.6% in the correct categorisation of failures while the SVM scored 97.4%, showing that ANN is also better suited for reducing false positives.For recall, the ANN scored 96.3%, indicating that the ANN is more effective in capturing most of the real fault events, minimising false negatives.For the F1-score metric, neural networks also show better results with 97.4%, which implies a good balance between accuracy and recall, indicating fewer errors in fault classification and stoppages.Another critical aspect to be evaluated to achieve this article's objective is the selected model's processing time.In this regard, the ANN has shown more favourable performance than SVM as it outperforms SVM in every score from the selected metrics, while at the same time achieving lower computational times.This is not even considering that ANNs are also optimised to run on graphics cards which may reduce the processing time of ANNs into a fraction of the original time.
Since this article aims to facilitate real-time implementation of the classification model, ensuring time efficiency is crucial.This efficiency allows for swift decision making, which is essential for maintaining operational continuity in industrial settings.
The ANN can learn complex patterns and relationships in the data, enabling accurate and dynamic classification.Combining PCA with ANN facilitates data dimensionality reduction, improving model efficiency and reducing processing times without sacrificing precision and accuracy.Predicting unplanned outages is critical to asset management and operational efficiency.This allows informed decisions to be made in real time and paves the way for developing more advanced failure prediction models.These models can effectively classify maintenance events within the copper-crushing line equipment context.Moreover, ANN's can successfully classify stoppages in a very short time, which may be the first step into the development of a real-time prediction tool.
In the context of this case study, the results highlight the superior performance of the ANN, which consistently outperformed the other algorithms in all the evaluated metrics.With accuracy, precision, recall, and F1-score values consistently exceeding 95%, ANNs demonstrate their remarkable ability to identify and categorise stoppage events while minimising misclassifications accurately.This exceptional performance underscores their potential as a valuable tool for similar tasks within the industry.

Conclusions
This article presents and evaluates the application of two ML algorithms-Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs)-together with Principal Component Analysis (PCA) to create a hybrid model.Both the tested approaches effectively classify the nature of the stoppages within the copper-crushing line equipment context, ultimately serving as a foundational component for future failure prediction models.This forward-looking approach holds significant potential for maintenance practices within the mining industry and across various sectors that rely on sensor data and stress-related variables for equipment.
The importance of this application in the mining industry is highlighted by its ability to improve the efficiency of existing processes, which have historically been plagued with maintenance management problems due to manual data collection, resulting in errorprone data.By focussing on classification, based solely on data generated by operational sensors, this research provides more reliable information for prediction.As such, it lays the foundation for creating a more reliable failure prediction model, thereby reducing dependence on external sources and mitigating potential problems such as inaccuracies, data gaps, and outdated information.Ultimately, this methodology has the potential to significantly improve operational efficiency and reduce costs in the mining industry, consolidating its relevance and positive impact on the sector.
In the context of this case study, the results highlight the superior performance of the ANN, which consistently outperformed the other algorithms in all evaluated metrics.With accuracy, precision, recall, specificity, and F1-score values consistently exceeding 95%, ANNs demonstrate their remarkable ability to identify and categorise stoppage events while minimising misclassifications accurately.The specificity achieved in the models, particularly the ANN's superior ability to correctly reject non-failure events as non-failures, demonstrates the model's effectiveness in avoiding unnecessary maintenance actions and underscores the reliability of the classification system in operational environments.
Another critical aspect examined in this article is the processing time of the selected model where the ANN proved to be more time-efficient than the SVM, a significant advan-tage given the real-time implementation required for effective classification.Furthermore, as mentioned, current implementations of ANN can leverage the computational power of graphic cards (GPU) to reduce the computational times by fractions of the processing time in a conventional CPU.
In light of the encouraging results of this research, it is strongly recommended that this methodology and model are implemented within copper-crushing line equipment and their extension to other equipment is explored.This methodology effectively leverages the benefits of ML, enabling the processing of substantial data volumes in real time and enhancing early event classification, thus contributing to significant process efficiency improvements.
In conclusion, this study has successfully developed a hybrid ML model capable of identifying and categorising unscheduled stoppage events in the equipment database of a Chilean mining company.This achievement not only fulfils the primary research objectives but also lays the foundation for the future integration of this methodology into advanced failure prediction models.As a result, it offers the potential to enhance the reliability of preventive maintenance practices, ultimately improving the operational efficiency and cost effectiveness of industrial processes.

Figure 1 .
Figure 1.Representation of an Artificial Neural Network with two hidden layers.

Figure 1 .
Figure 1.Representation of an Artificial Neural Network with two hidden layers.

Figure 2 .
Figure 2. Representation of an SVM classification.

Figure 2 .
Figure 2. Representation of an SVM classification.

Figure 3 .
Figure 3. Mining crushing line.The main identified failure modes of the CV-2C conveyor belt include the following: • Electrical failure: Problem related to the belt's electrical system; • Mechanical failure: Wear or damage to mechanical components; • Pulley misalignment: Misalignment of pulleys; • Belt cut: Physical damage to the belt; • Belt misalignment: Misalignments in belt alignment; • Damaged idler: Damage to the idlers supporting the belt; • Alignment rod failure: Problems with the belt alignment rod.

Figure 5 .
Figure 5. Different sources of this study's database.The DCS informs the measures of the parameters shown in Error!Reference source not found.while the DT informs accurate estimations of the parameters represented in Error!Reference source not found., and the maintenance database reports the maintenance stoppages information described in Error!Reference source not found..

Figure 5 .
Figure 5. Different sources of this study's database.

Figure 7 .
Figure 7. Number of main components.The graph in Figure8clearly shows the percentage of variance explained by each of the four principal components selected.

Figure 7 .
Figure 7. Number of main components.The graph in Figure8clearly shows the percentage of variance explained by each of the four principal components selected.

Figure 7 .
Figure 7. Number of main components.

Figure 8 .
Figure 8. Contribution of principal components to variance.Figure 8. Contribution of principal components to variance.

Figure 8 .
Figure 8. Contribution of principal components to variance.Figure 8. Contribution of principal components to variance.

Figure 9 .
Figure 9. Confusion matrix for SVM classification with RBF kernel and C = 100.

Figure 9 .
Figure 9. Confusion matrix for SVM classification with RBF kernel and C = 100.

Figure 10 .
Figure 10.Confusion matrix of the ANN.

Table 2 .
Parameters reported by the DCS.

Table 3 .
Estimations reported by the DT.

Table 4 .
Maintenance database stoppages description entries.

Table 6 .
Exploratory data analysis summary.

Table 7 .
SVM accuracy for different kernels and values of C.

Table 8 .
Performance metrics for the selected SVM model.

Table 8 .
Performance metrics for the selected SVM model.

Table 10 .
Classification processing times, in seconds, with and without conducting PCA first.

Table 11 .
Performance metric comparison for SVM and ANN.